Input data and output of the research described in the paper: F. Kunneman and A. Van den Bosch (2016), Open-domain extraction of future events from Twitter, Natural Language Engineering, doi: 10.1017/S1351324916000036

The paper describes a system that extracts future-referring time expressions and entities from Twitter messages, and subsequently detects events as pairs of a date and an entity that are often mentioned in the same tweet. This dataset contains the IDs of a large set of Dutch tweets posted in August 2014, which were used as input to the system, together with the time expression and/or entity extracted from each tweet, if any. The detected events are also included. Each event is represented by a date, one or more describing terms, the IDs of the tweets that refer to it, and the assessment of the event by human annotators.
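The date-entity pairing described above can be illustrated with a minimal sketch. It assumes each tweet has already been reduced to a tweet ID, an optional future date, and a list of entities. The function name `detect_events`, the `min_count` threshold, and the sample records are hypothetical, and plain co-occurrence counting stands in here for whatever scoring the paper actually uses.

```python
from collections import Counter


def detect_events(tweets, min_count=2):
    """Group tweets by (date, entity) pair and keep the frequent pairs.

    tweets: iterable of (tweet_id, date_or_None, entities) tuples.
    min_count: hypothetical threshold on the number of supporting tweets.
    """
    pair_counts = Counter()
    pair_tweets = {}
    for tweet_id, date, entities in tweets:
        if date is None:
            # No future-referring time expression was extracted from this tweet.
            continue
        for entity in set(entities):
            key = (date, entity)
            pair_counts[key] += 1
            pair_tweets.setdefault(key, []).append(tweet_id)
    # Keep only the pairs mentioned together in enough tweets.
    return {key: pair_tweets[key] for key, n in pair_counts.items() if n >= min_count}


# Invented sample records, for illustration only (not taken from the dataset).
sample = [
    ("1", "2014-08-30", ["lowlands"]),
    ("2", "2014-08-30", ["lowlands", "biddinghuizen"]),
    ("3", None, ["lowlands"]),
    ("4", "2014-08-31", ["ajax"]),
]
events = detect_events(sample)
```

With the sample above, only the pair of `2014-08-30` and `lowlands` is supported by two tweets, so it is the only candidate event returned, along with the IDs of the tweets that mention it.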